Support Vector Machines for Predicting microRNA Hairpins
Abstract
MicroRNAs (miRNAs) are 20-22 nt noncoding RNAs that are rapidly emerging as crucial regulators of gene expression in plants and animals. Identifying the hairpins that yield mature miRNAs is the first and most challenging step in miRNA gene prediction. We believe this step is best achieved with biologically motivated feature design and with classification techniques that account for the dependencies inherent in any set of hairpin features. We present DIANA-microH, a tool for predicting microRNA hairpins with high specificity and sensitivity. DIANA-microH implements a Support Vector Machine (SVM) classifier trained on a set of structural and evolutionary features characteristic of miRNA hairpins. It also introduces a unique structural feature motivated by a consideration of how enzymatic cleavage occurs. On test data, the SVM classifier achieved an accuracy of 98.6%. DIANA-microH is applied to chromosome 21 to provide a set of highly probable miRNA hairpins for future laboratory testing.
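To make the idea of structural hairpin features concrete, here is a minimal sketch of how a few such features could be computed from a predicted secondary structure. It assumes the structure is given in dot-bracket notation, which the abstract does not mention. The function name `hairpin_features` and the four features it returns are hypothetical illustrations, not the features used by DIANA-microH.

```python
def hairpin_features(structure: str) -> dict:
    """Compute a few illustrative structural features from a dot-bracket string.

    Assumption: '(' and ')' mark paired bases and '.' marks unpaired bases.
    These features are hypothetical examples, not the DIANA-microH feature set.
    """
    stack = []   # positions of '(' still waiting for a partner
    pairs = []   # (open_index, close_index) for every base pair
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")

    n = len(structure)
    # The innermost pair of a hairpin encloses only its terminal loop, so the
    # smallest enclosed span is the size of the smallest terminal loop.
    loop_length = min((j - i - 1 for i, j in pairs), default=0)
    return {
        "length": n,
        "base_pairs": len(pairs),
        "paired_fraction": 2 * len(pairs) / n if n else 0.0,
        "loop_length": loop_length,
    }


if __name__ == "__main__":
    print(hairpin_features("((..((....))..))"))
```

A feature vector of this kind, computed for known hairpins and for negative examples, is the sort of input an SVM classifier could be trained on.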
Similar resources
Prediction of human microRNA hairpins using only positive sample learning
MicroRNAs (miRNAs) are small molecular non-coding RNAs that have important roles in the post-transcriptional mechanism of animals and plants. They are commonly 21-25 nucleotides (nt) long and derived from 60-90 nt RNA hairpin structures, called miRNA hairpins. A larger number of sequence segments in the human genome have been computationally identified with such 60-90 nt hairpins, however the m...
Evaluation of the Efficiency of Linear and Nonlinear Models in Predicting Monthly Rainfall (Case Study: Hamedan Province)
In this research, we used the support vector machine (SVM), the support vector machine combined with wavelet transform (W-SVM), ARMAX and ARIMA models to predict monthly precipitation values. The study considers monthly time series data for precipitation stations located in Hamedan province during a 25-year period (1998-2016). The 25-year simulation period was divided into 17 years for t...
Predicting cardiac arrhythmia on ECG signal using an ensemble of optimal multicore support vector machines
The use of artificial intelligence in diagnosing heart disease has been considered by researchers for many years. In this paper, an efficient method for selecting appropriate features extracted from electrocardiogram (ECG) signals, based on a genetic algorithm for use in an ensemble of multi-kernel support vector machine classifiers, each of which is based on an optimized genetic al...
A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels
The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a Single Layer Feed-forward Neural Network (SLFNN) with a high training speed. The dimensionless parameter of limiting velocity, known as the densimetric Froude number (Fr), is predicte...
STAGE-DISCHARGE MODELING USING SUPPORT VECTOR MACHINES
Establishment of rating curves is often required by hydrologists for flow estimates in streams, rivers, etc. Measurement of discharge in a river is a time-consuming, expensive, and difficult process, and the conventional approach of regression analysis of the stage-discharge relation does not provide encouraging results, especially during floods. P